\chapter{Theory}
\label{chapter:theory}
\section{Diffraction}
The diffraction of a monochromatic, linearly polarized plane wave with wavelength $\lambda$ is described in general
by the Fresnel-Kirchhoff diffraction integral
\beq
  E(\bar{x},\bar{y},z)=\frac{1}{i\lambda}\iint\limits_{-\infty}^{\infty}E(x,y)\frac{\exp(ikr)}{r}\cos(\hat{n},\hat{r})\,dx\,dy
  \label{equ:fresnel}
\eeq
with $r=\sqrt{(\bar{x}-x)^2+(\bar{y}-y)^2+z^2}$. Equation \ref{equ:fresnel} describes the field distribution $E(\bar{x},\bar{y},z)$
in a plane $(\bar{x},\bar{y},z)$ behind the diffracting object. 

\subsection{Fresnel Diffraction}
For paraxial rays, the direction factor $\cos(\hat{n},\hat{r})$ is approximately 1, and the distance $r$ between a point $(x,y,0)$ in the object plane and
$(\bar{x},\bar{y},z)$ in the image plane is well approximated by $z$ in the slowly varying amplitude factor, while $r$ must be kept in the rapidly oscillating phase. This leads to the following simplification of \refequ{equ:fresnel}:
\beq
E(\bar{x},\bar{y},z)=\frac{1}{i\lambda z}\iint\limits_{-\infty}^{\infty}E(x,y)\exp(ikr)dxdy
\eeq
A Taylor expansion of the radius, $r=z\sqrt{1+\frac{(\bar{x}-x)^2}{z^2}+\frac{(\bar{y}-y)^2}{z^2}} \approx z + \frac{(\bar{x}-x)^2}{2z}+\frac{(\bar{y}-y)^2}{2z}$,
and further simplification leads to the Fresnel propagation formula
\beq
E(\bar{x},\bar{y},z)=A(\bar{x},\bar{y},z)\,\mathcal{F}\left[E(x,y) \exp\left(\frac{i\pi}{\lambda z}(x^2+y^2)\right)\right]\left(\frac{\bar x}{\lambda z},\frac{\bar y}{\lambda z}\right)
\eeq
where $A(\bar{x},\bar{y},z)=\frac{\exp(ikz)}{i\lambda z}\exp\left(\frac{i\pi}{\lambda z}(\bar{x}^2+\bar{y}^2)\right)$ collects the phase factors outside the Fourier transform.
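The Fresnel propagation formula can be evaluated numerically with a single FFT. The following sketch illustrates this; the wavelength, pixel size, propagation distance, and square aperture are purely illustrative assumptions, and the quadratic phase prefactor $A$ is omitted since it drops out of any intensity measurement:

```python
import numpy as np

def fresnel_propagate(E, wavelength, z, dx):
    """Propagate a field E over distance z via the single-FFT Fresnel formula.

    E: complex 2D array sampled with pixel size dx in the object plane.
    Returns the field in the observation plane, up to the quadratic
    phase prefactor A (irrelevant for intensities).
    """
    N = E.shape[0]
    x = (np.arange(N) - N // 2) * dx
    X, Y = np.meshgrid(x, x)
    # quadratic phase factor inside the Fourier transform
    chirp = np.exp(1j * np.pi / (wavelength * z) * (X**2 + Y**2))
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(E * chirp)))

# illustrative parameters, not taken from any experiment described here
wavelength = 1e-10   # 1 Angstrom hard X-rays
dx = 1e-7            # 100 nm pixels
aperture = np.zeros((256, 256), dtype=complex)
aperture[96:160, 96:160] = 1.0   # square aperture as a toy object
E_far = fresnel_propagate(aperture, wavelength, z=0.5, dx=dx)
I = np.abs(E_far)**2
```

Since the chirp has unit modulus, the unnormalized FFT conserves energy up to the factor $N^2$ (Parseval), which provides a simple sanity check on such an implementation.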
\subsection{Fraunhofer Diffraction}
\label{sec:fraunhofer}
In the far field, at distances much larger than the object size, the inequality $z\gg\frac{\pi}{\lambda}(x^2+y^2)$ holds and the quadratic phase factor
$\exp\left(\frac{i\pi}{\lambda z}(x^2+y^2)\right)$ is therefore approximately 1. The far-field diffraction is then calculated as
\beq
  \tilde{E}(v,w,z)=A(\lambda z v, \lambda z w, z) \mathcal{F}E(v,w,0)\label{equ:fraunhofer}
\eeq
Calculating the inverse Fourier transform of a far-field diffraction pattern therefore yields the structure of the scattering object.
Unfortunately, photon detectors measure only the intensity of the scattered wave:
\beq
  I(v,w,z)=\abs{\tilde{E}(v,w)}^2=\frac{1}{\lambda^2 z^2}\abs{\mathcal{F}E(v,w)}^2
\eeq
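The consequence of measuring only intensities can be demonstrated with a short numerical sketch (object size and values are illustrative). Inverting the intensity alone returns the autocorrelation of the object, and restoring the amplitude with zero phase does not recover the object:

```python
import numpy as np

rng = np.random.default_rng(0)
obj = np.zeros((64, 64), dtype=complex)
obj[20:30, 25:40] = rng.random((10, 15))   # small, compactly supported object

F = np.fft.fft2(obj)
I = np.abs(F)**2                           # what the detector records

# The inverse transform of the intensity is the autocorrelation of the
# object (Wiener-Khinchin), not the object itself -- the phase is gone.
autocorr = np.fft.ifft2(I)

# Amplitude alone with the phase set to zero does not reproduce the object.
recovered = np.fft.ifft2(np.sqrt(I))
```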
The complex phase of the electromagnetic wave is lost and has to be recovered in order to resolve the structure of the scattering object. A historical
overview of so-called phase-retrieval techniques was given in \refchap{chapter:Introduction}; in the following, the ptychographic
technique shall be explained.
\section{Typical Ptychography setup}
Figure \ref{fig:ptychosetup} shows a typical setup for ptychography. The experiments are usually carried out at synchrotron sources because of their high brilliance and the coherence of the X-rays.
\begin{figure}
 \centering
 \includegraphics[width=0.8 \textwidth]{./chapters/theory/diffsetup.png}
 % ptychosetup.png: 533x369 pixel, 96dpi, 14.10x9.76 cm, bb=0 0 400 277
 \caption{A typical ptychography setup: An incoming hard X-ray beam is focused onto the object. A pixel array detector records an oversampled diffraction pattern for each raster position of the specimen.}
 \label{fig:ptychosetup}
\end{figure}
In a full experiment, the specimen is scanned through the incoming X-ray beam at positions lying on concentric circles of increasing radius, as shown in \reffig{fig:objwithpos}. The diffraction patterns are collected in the far field
behind the sample with a CCD detector.
\section{Ptychographic reconstruction algorithms}
As was shown in \refsec{sec:fraunhofer}, the phase information is lost in the detection process and must be recovered if the structure of the diffracting object is to be reconstructed. Over the last decades,
many different algorithms have been developed for solving this so-called phase problem, among them hybrid input-output (HIO) \cite{fienup:78}, relaxed averaged alternating reflections (RAAR) \cite{luke2004},
error reduction (ER) \cite{fienup:78}, and the difference map (DM) \cite{Elser:03}. All of these statistics-unaware methods are based on alternating projections between
two different constraint sets and neglect the fact that the measured signal is contaminated with Poisson noise. Recently, \citeauthor{thibault2012} have shown that statistical models that account for the noise,
together with a maximum-likelihood optimization, yield significantly better resolution and signal-to-noise ratios. This thesis is based on a combination of both approaches, and therefore both shall be introduced
in the following sections.
\subsection{Statistics-unaware iterative reconstruction methods}
An iterative procedure for solving the phase problem in Fourier imaging was first proposed in 1972 by \citeauthor{GSalgo} \cite{GSalgo}. It solves the phase problem by alternating projections between Fourier and real
space, applying a-priori known constraints as shown in \reffig{fig:gsdia}:
\begin{figure}[h!]
 \centering
 \includegraphics[width=0.8 \textwidth]{./chapters/theory/GSdiagram.png}
 % GSdiagram.png: 640x398 pixel, 600dpi, 2.71x1.68 cm, bb=0 0 77 48
 \caption{Schematic of alternating projection algorithms.}
 \label{fig:gsdia}
\end{figure}
The Fourier-space constraint is simply that the amplitude of the reconstructed image has to equal the square root of the measured intensity. The error-reduction algorithms alternate between
real and Fourier space and, after each transformation, apply the corresponding constraint to project the current iterate onto the set of possible solutions.
The real-space constraint follows from a counting argument: for a successful reconstruction, the object must cover less than 50 percent of the measured pixels. The number of unknown parameters
for an $N$-pixel image is then $N/2$ real and $N/2$ imaginary parts, and the problem is no longer underdetermined, because the $N$ measured diffraction intensities are known. This constraint can be achieved either by imaging
only small objects \cite{Chapman06,Shapiro2005} or by illuminating only an isolated area, which is the ptychography approach. This sparsity, or ``support'', constraint is applied after transforming the image back to real space.
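The alternating-projection loop can be sketched in a few lines. The object, support mask, random phase start, and iteration count below are illustrative assumptions; the loop applies the modulus constraint in Fourier space and the support constraint in real space:

```python
import numpy as np

def error_reduction(intensity, support, n_iter=200, seed=0):
    """Basic error-reduction (Gerchberg-Saxton-type) iteration.

    intensity: measured far-field intensity |F[object]|^2 (unshifted FFT order)
    support:   boolean mask, True where the object may be nonzero
    """
    rng = np.random.default_rng(seed)
    amplitude = np.sqrt(intensity)
    # start from the measured amplitude with a random phase guess
    g = np.fft.ifft2(amplitude * np.exp(2j * np.pi * rng.random(intensity.shape)))
    for _ in range(n_iter):
        G = np.fft.fft2(g)
        # Fourier-space projection: keep the phase, impose the measured amplitude
        G = amplitude * np.exp(1j * np.angle(G))
        g = np.fft.ifft2(G)
        # real-space projection: zero everything outside the support
        g = np.where(support, g, 0)
    return g

# demo on a toy object confined to a known support (illustrative)
support = np.zeros((32, 32), dtype=bool)
support[10:16, 12:20] = True
obj = np.where(support, 1.0 + 0j, 0)
measured = np.abs(np.fft.fft2(obj))**2
result = error_reduction(measured, support, n_iter=100)
```

Convergence of this basic loop to the true object is not guaranteed for an arbitrary random start; the sketch only illustrates the two projections.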
Even better reconstructions can be produced if one accounts for the noise statistics in a probabilistic approach, which will be shown in the next section.
\subsection{Statistical treatment: Maximum-Likelihood formulation of Ptychography}
This section introduces a maximum likelihood formalism for ptychography, as first proposed in \cite{thibault2012}.
The two-dimensional complex-valued illumination function $P_{\vec r}$ (subsequently called ``probe'') represents the amplitude and phase of the illumination at the plane of the sample. The sample-wavefield interaction is
modelled by a multiplicative transmission function $O_{\vec{r}}$ (subsequently called ``object''). At the exit of the sample, the wavefield is given by
\beq
 \psi_{j\vec{r}} = P_{\vec{r}-\vec{r_j}}O_{\vec{r}}
\eeq
with $\vec r= (x,y)$, a coordinate in the plane normal to the propagation direction of the wave, and $j$ the label of the probe position relative to the object. As shown in \refsec{sec:fraunhofer},
the intensity at position $\vec{q} = (v,w)$ in the far-field plane is
\beq
I_{j\vec{q}}= \abs{\tilde{\psi}_{j\vec{q}}}^2=\abs{\ft{P_{\vec{r}-\vec{r_j}}O_{\vec{r}}}}^2
\label{equ:I}
\eeq
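In discretized form this forward model is a probe shift, a pointwise multiplication, and an FFT. The sketch below assumes integer pixel shifts with periodic boundaries (\texttt{np.roll} standing in for the translation $P_{\vec{r}-\vec{r_j}}$); probe and object values are illustrative:

```python
import numpy as np

def exit_wave(probe, obj, shift):
    """psi_j = P(r - r_j) * O(r): shift the probe, multiply with the object.

    Integer pixel shifts with periodic wrap-around are an assumption of
    this sketch (np.roll models the translation).
    """
    return np.roll(probe, shift, axis=(0, 1)) * obj

def farfield_intensity(psi):
    """I_j = |FFT(psi_j)|^2, the discretized far-field intensity."""
    return np.abs(np.fft.fft2(psi))**2

# toy probe and object (illustrative values)
probe = np.zeros((64, 64), dtype=complex)
probe[:8, :8] = 1.0                    # small illuminated patch
obj = np.ones((64, 64), dtype=complex)
psi = exit_wave(probe, obj, (5, 3))
I = farfield_intensity(psi)
```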
For practical applications, $\vec{r}$ and $\vec{q}$ are discrete coordinates on a finite rectangular grid, and the Fourier transform is replaced by a Fast Fourier Transform. The reconstruction task is to find a pair
$(P_{\vec r},O_{\vec{r}})$ that satisfies the system of equations \ref{equ:I}. In the case where only shot noise contributes to the measured photon count, the probability of measuring $n_{j\vec{q}}$ photons,
given the probe and the object function, follows a Poisson distribution:
\beq
 p(n_{j\vec{q}}|P_{\vec{r}},O_{\vec{r}}) = \frac{(I_{j\vec{q}})^{n_{j\vec{q}}}}{n_{j\vec{q}}!}e^{-I_{j\vec{q}}}
\eeq
For reasonably high photon counts, the Poisson distribution approaches a normal distribution with variance $\sigma^2_{j\vec{q}}$ equal to the mean $I_{j\vec{q}}$ (reference):
\beq
 p(n_{j\vec{q}}|P_{\vec{r}},O_{\vec{r}}) = \frac{1}{\sqrt{2 \pi \sigma^2_{j\vec{q}}}}\exp(-\frac{(I_{j\vec{q}}-n_{j\vec{q}})^2}{2 \sigma^2_{j\vec{q}}})
\eeq
Because the logarithm is a monotonically increasing function, maximizing the likelihood of this distribution is equivalent to minimizing the negative log-likelihood $\mathcal{L}$. The latter is preferred in
practice, because optimization problems are usually stated as minimization problems:
\beq
\mathcal{L} = -\log \prod_{j} \prod_{\vec q}  p(n_{j\vec{q}}|P_{\vec{r}},O_{\vec{r}}) =\sum_{j}\sum_{\vec{q}} \frac{w_{j\vec{q}}}{2\sigma^2_{j\vec{q}}}(I_{j\vec{q}}-n_{j\vec{q}})^2 + const.
  \label{equ:logll}
\eeq
For later use we write the log-likelihood as $\mathcal{L} = \sum_{j}\sum_{\vec{q}} \mathcal{L}_{j\vec{q}}$.
The weight factor $w_{j\vec{q}}$ has been introduced because the sum must include only valid measurements: $w_{j\vec{q}}$ is set to 1 for valid pixels and to 0 for unmeasured regions.
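The weighted negative log-likelihood of \refequ{equ:logll} (up to the constant) is straightforward to evaluate. In this sketch the Poisson variance $\sigma^2_{j\vec{q}}$ is approximated by the measured counts, clipped to 1 to avoid division by zero at empty pixels; this clipping is an implementation choice, not part of the derivation above:

```python
import numpy as np

def neg_log_likelihood(model_I, counts, weights):
    """Gaussian approximation of the Poisson NLL, up to an additive constant.

    model_I: intensities I_{jq} predicted from the current probe/object
    counts:  measured photon numbers n_{jq}
    weights: w_{jq}, 1 for valid pixels and 0 for unmeasured regions
    """
    # sigma^2 approximated by the counts; clip at 1 photon (assumption)
    sigma2 = np.maximum(counts, 1.0)
    return np.sum(weights * (model_I - counts)**2 / (2.0 * sigma2))

# toy data: one masked pixel (weight 0), illustrative counts
counts = np.array([[4.0, 9.0], [0.0, 16.0]])
weights = np.array([[1.0, 1.0], [0.0, 1.0]])
nll_zero = neg_log_likelihood(counts, counts, weights)       # perfect model
nll_off = neg_log_likelihood(counts + 1.0, counts, weights)  # perturbed model
```

A perfect model gives zero, and any deviation on a valid pixel increases the functional, as expected of a fidelity term to be minimized.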

\section{Finite difference} \label{sec:diff}
In the course of this thesis, finite differences will be used to calculate discrete derivatives, so they shall be introduced briefly here. In analysis, the partial derivative of a multivariate function
is given by
\beq
\pardevS{x}{f(x,y)} = \lim_{\eps\rightarrow 0} \frac{f(x+\eps,y)-f(x,y)}{\eps}
\eeq
There are several ways to discretize this derivative for numerical purposes. In this thesis we use the central difference
\beq
  \delta_{(h,x)}[f](x,y) = f(x+\frac{1}{2}h,y)-f(x-\frac{1}{2}h,y)
\eeq
to form the partial derivative
\beq
  \frac{\delta_{(h,x)}[f](x,y)}{h} = \pardevS{x}{f(x,y)} + O(h^2)
\eeq
In the same way, second derivatives can be computed:
\beq
  \partial_x^2 f(x,y) \approx \frac{\delta_{(h,x)}^2[f](x,y)}{h^2} = \frac{f(x+h,y)-2f(x,y)+f(x-h,y)}{h^2}
\eeq
\beqan
  \partial_{xy} f(x,y) &\approx  \frac{f(x+h,y+h)-f(x+h,y-h)}{4 h^2} \\ &+\frac{-f(x-h,y+h)+f(x-h,y-h)}{4 h^2}
\eeqan
Central differences will be used for the computation of the Hessian in \refchap{chapter:subpix}.
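The three stencils above translate directly into code. The following sketch checks them against the analytically known derivatives of a test function (the test function and step sizes are illustrative choices):

```python
import numpy as np

def central_diff_x(f, x, y, h=1e-5):
    """First partial derivative in x via the central difference, O(h^2)."""
    return (f(x + h / 2, y) - f(x - h / 2, y)) / h

def second_diff_x(f, x, y, h=1e-4):
    """Second partial derivative in x via the three-point stencil."""
    return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2

def mixed_diff(f, x, y, h=1e-4):
    """Mixed partial derivative d^2 f / dx dy via the four-point stencil."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)

# test function with known derivatives:
# d/dx = cos(x) y^2, d^2/dx^2 = -sin(x) y^2, d^2/dxdy = 2 cos(x) y
f = lambda x, y: np.sin(x) * y**2
```

Comparing the stencils with the exact derivatives at a sample point confirms the expected $O(h^2)$ accuracy.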







